Evolved Matrix Operations for Post-processing Protein Secondary Structure Predictions

Authors

  • Varun Aggarwal
  • Robert M. MacCallum
Abstract

Predicting the three-dimensional structure of proteins is a hard problem, so many have opted instead to predict the secondary structural state (usually helix, strand or coil) of each amino acid residue. This should be an easier task, but it now seems that a ceiling of around 76% per-residue three-state accuracy has been reached. Further improvements will require the correct processing of so-called “long-range information”. We present a novel application of genetic programming to evolve high-level matrix operations to post-process secondary structure prediction probabilities produced by the popular, state-of-the-art neural network-based PSIPRED by David Jones. We show that global and long-range information may be used to increase three-state accuracy by at least 0.26 percentage points – a small but statistically significant difference. This is on top of the 0.14 percentage point increase already made by PSIPRED’s built-in filters.
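
Two ideas in the abstract can be made concrete with a minimal sketch: per-residue three-state accuracy, and post-processing a matrix of per-residue state probabilities using information from the whole protein. Nothing below is the authors' evolved operation or real PSIPRED output. The function names, the blending rule, its weight, and the toy probabilities are all assumptions chosen for illustration.

```python
from typing import List

STATES = "HEC"  # helix, strand, coil

# Toy per-residue probabilities [p_helix, p_strand, p_coil]; invented values,
# not output from PSIPRED or any other predictor.
probs: List[List[float]] = [
    [0.80, 0.10, 0.10],
    [0.70, 0.20, 0.10],
    [0.40, 0.15, 0.45],
    [0.60, 0.20, 0.20],
    [0.10, 0.10, 0.80],
]
observed = "HHHHC"  # hypothetical true states for the same five residues


def q3(predicted: str, observed_states: str) -> float:
    """Per-residue three-state accuracy, as a percentage."""
    if len(predicted) != len(observed_states) or not observed_states:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed_states))
    return 100.0 * matches / len(observed_states)


def argmax_states(matrix: List[List[float]]) -> str:
    """Assign each residue the state with the highest probability."""
    return "".join(STATES[row.index(max(row))] for row in matrix)


def blend_with_global_mean(matrix: List[List[float]], weight: float) -> List[List[float]]:
    """Hypothetical post-processing step: mix every row with the column means.

    The column means summarise the whole protein, so each residue's
    probabilities are nudged by global information. Rows still sum to 1
    if they did on input.
    """
    n = len(matrix)
    means = [sum(row[k] for row in matrix) / n for k in range(3)]
    return [
        [(1.0 - weight) * row[k] + weight * means[k] for k in range(3)]
        for row in matrix
    ]


raw_prediction = argmax_states(probs)
post_prediction = argmax_states(blend_with_global_mean(probs, weight=0.3))
print(raw_prediction, q3(raw_prediction, observed))
print(post_prediction, q3(post_prediction, observed))
```

In this toy case the third residue flips from coil to helix because the protein as a whole is helix-rich. The paper reports much smaller gains on real data, and its operations are evolved by genetic programming rather than fixed by hand as here.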

Similar resources

In Silico and in Vitro Investigations on cry4a and cry11a Toxins of Bacillus thuringiensis var israelensis

In the present study we attempted to correlate the structure and function of the cry11a (72 kDa) and cry4a (135 kDa) proteins of Bacillus thuringiensis var israelensis. Homology modeling and secondary structure predictions were done to locate most probable regions for finding helices or strands in these proteins. The JPRED (JPRED consensus secondary structure prediction server) secondary struct...

Evolved Cellular Automata for Protein Secondary Structure Prediction Imitate the Determinants for Folding Observed in Nature

We demonstrate the first application of cellular automata to the secondary structure predictions of proteins. Cellular automata use localized interactions to simulate global phenomena, which resembles the protein folding problem where individual residues interact locally to define the global protein conformation. The protein's amino acid sequence was input into the cellular automaton and rules ...

Prediction of Spontaneous Protein Deamidation from Sequence-Derived Secondary Structure and Intrinsic Disorder.

Asparagine residues in proteins undergo spontaneous deamidation, a post-translational modification that may act as a molecular clock for the regulation of protein function and turnover. Asparagine deamidation is modulated by protein local sequence, secondary structure and hydrogen bonding. We present NGOME, an algorithm able to predict non-enzymatic deamidation of internal asparagine residues i...

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

A DNA sequence, containing all genetic traits, is not a functional entity. Instead, it is converted to protein sequences by the transcription and translation processes. This protein sequence later takes on a 3D structure, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...

Protein structure prediction from the amino acid sequence is a fundamental and challenging problem in molecular biology. Stimulated by the difficulty of the overall structure prediction, computational methods for the prediction

State-of-the-art methods for secondary structure (Porter, Psi-PRED, SAM-T99sec, Sable) and solvent accessibility (Sable, ACCpro) predictions use evolutionary profiles represented by the position specific scoring matrix (PSSM). It has been demonstrated that evolutionary profiles are the most important features in the feature space for these predictions. Unfortunately applying PSSM matrix leads t...

Publication date: 2004